Modeling the Effects of Students’ Interactions with 
Immersive Simulations using Markov Switching Systems 


Nicholas Hoernle 
Harvard University 


nhoernle@g.harvard.edu 


Pavlos Protopapas 
Harvard University 


pavlos@seas.harvard.edu 


ABSTRACT 


Simulations that combine real world components with inter- 
active digital media provide a rich setting for students with 
the potential to assist knowledge building and understand- 
ing of complex physical processes. This paper addresses the 
problem of modeling the effects of multiple students’ simul- 
taneous interactions on the complex and exploratory envi- 
ronments such simulations provide. We work towards assist- 
ing educators with the difficult task of interpreting student 
exploration. We represent the system dynamics that result 
from student actions with a complex time series and use 
switch based models to decompose the time series into indi- 
vidual periods that target interpretability for teachers. The 
model learns the transition points between successive peri- 
ods in the time series as well as the internal dynamics that 
govern each period. This model differs from other switch 
based models in that it decomposes the time series in a way 
that is human interpretable. This approach was applied to 
data that was obtained from an existing multi-person simu- 
lation with pedagogical goals of teaching sustainability and 
systems thinking. A visualization of the model was designed 
to validate the interpretability of the generated text-based 
descriptions when compared to a movie representation of the 
system dynamics. A pilot study using this visualization indi- 
cates that the switch based model finds relevant boundaries 
between salient periods of student work. 
1. INTRODUCTION 


Complex systems simulations are becoming increasingly com- 
mon in formal and informal STEM learning environments [21]. 
These simulations present scientific phenomena in a manner 
that bridges principles of science and the firsthand experience
of emergent, real-world outcomes. However, the open-ended
and exploratory nature of these simulations presents
challenges to teachers’ understanding of students’ learning. 
Students’ actions have immediate and long-term effects on 
the simulation leading to a rich array of emergent outcomes. 
Teachers may wish to discuss students’ interactions to high- 
light salient learning opportunities, but if there are too many 
“moving parts” to the simulation, this becomes a challenging 
ideal. 


This paper describes an automatic method for extracting 
salient periods from the log files that are generated by com- 
plex exploratory learning environments (ELE). Our goal is 
to generate relevant summaries of the system dynamics such 
that teachers can effectively engage students in discussions 
that stem from their own experiences with the simulations. 
We study an application of Switching State Space Mod- 
els (SSSM) to the task of extracting salient periods from 
a mixed reality ELE, Connected Worlds, installed at the 
New York Hall of Science (NYSci). SSSMs [7] are a class of 
model for time-series data where the parameters controlling 
a linear dynamic system switch according to a discrete la- 
tent process. These models have seen use in a wide variety 
of domains including control [11], statistics [2], economet- 
rics [8] and signal processing [14]. SSSMs combine hidden 
Markov and state space models to capture regime switch- 
ing in non-linear, continuous valued time series [22]. The 
intuition is that a system evolves over time but may un- 
dergo a regime change that results in an intrinsic shift in 
the system’s characteristics. Allowing for discrete points in
time where the dynamics change enhances the power of the
simple linear models to capture more complicated dynamics.
We propose that regime switching models also help to
increase the interpretability of large and complex systems 
by automatically segmenting a time series into regions of 
approximately uniform dynamics. The result is that a com- 
plex session is broken into smaller periods that are more 
readily understood upon reflection on the session. 


In this paper we introduce the Connected Worlds ELE and 
explain why teachers might need assistance when leading a 
discussion with the students where they reflect upon their 
actions. We expound on the SSSM and propose a method 
for decomposing a complex time series into smaller periods 
aiming to assist teachers when reflecting on a session with 
a class. We lastly present results showing the efficacy of
our approach on both synthetic data and on data collected
from CW. The CW validation is a preliminary study with
significant results which suggest that the model output is
human interpretable.

Figure 1: Bird’s eye snapshot view taken from
the movie representation of the CW environment.
Biomes are labeled on the perimeter and logs appear
as thick red lines. Water enters via the waterfall and
in this image it mainly flows toward the desert and
the plains.


2. CONNECTED WORLDS 


Connected Worlds¹ (CW) is a multi-person ecology simulation
with the goal of teaching students about complex systems
and systems thinking. It consists of an immersive environment
comprising four biomes linked by a central flow of water
that is fed by a waterfall. The simulation exhibits large scale
feedback loops and presents the opportunity for participants
to experience how their actions can have (often unintended)
effects that are significantly removed in time and/or space.
Students plant trees which flourish or die, animals arrive or
depart, and rain clouds form, move through the sky and
deposit rain into the waterfall.


Students interact with CW by positioning logs to control the 
direction of the water that flows in the simulation. Water 
can be directed to each of the four biomes (desert, plains, 
jungle, wetlands) and the distribution of flowing water de- 
pends on the placement of the logs. Water enters the simu- 
lation in two ways. The students can actively release water 
into the system from the stored water in the reservoir. Rain- 
fall events are out of the students’ control and these release 
water into the waterfall (to replenish the primary source of 
water) and into the individual biomes. 


Figure 1 shows a bird’s eye snapshot view of the state of 
the simulation in CW. The nature of the simulation is com- 
plex on a variety of dimensions. The simulation involves a 
large number of students simultaneously executing actions 
that change the state of the simulated environment. No one 
person - including the teacher or interpreter - can possibly 
follow what happens, even in a relatively short simulation. 
Each participant will have a different view of what transpired,
depending on the actions s/he took and the state
changes that resulted. Thus it is important to develop tools
that can support teachers’ understanding of the effects of
students’ interactions in complex ELEs such as CW.

¹ https://nysci.org/home/exhibits/connected-worlds/


3. RELATED WORK 


This work is related to two separate strands of research: 
studying students’ interactions in mixed reality ELEs, and 
modeling complex systems using switching models. 


There is increasing evidence of the value of multi-person par- 
ticipatory simulations for engaging learners with complex 
science topics [9, 1, 23]. Research has explored classroom- 
scale participatory simulations where students play active 
roles in the simulation. Some examples include topics in dis- 
ease transmission [3] and human body systems [12]. Other 
work has placed students in the role of scientists experiment- 
ing with simulated ecosystems [17, 4]. Within all of these 
examples, learners both engaged directly with the simulation 
during enactment, and reflected on their actions afterward 
to better understand how their choices resulted in observed 
system outcomes. Research has shown that using data ob- 
tained from students’ own performances has the potential 
to engage them more effectively than presenting them with 
the results of an abstract simulation [16, 15]. Building on 
this work, our eventual goal is to provide assistive tools for 
teachers to further enhance the pedagogical impact that such 
ELEs can achieve. 


Much work has been completed in the field of mining mean- 
ingful knowledge from time series data [5, 10, 19]. Ghahra- 
mani and Hinton [7] introduce and give a detailed presenta- 
tion of the SSSM. We adapt this model to the special struc- 
ture that is inherent in CW. Cappé et al. [2] and Giordani 
et al. [8] use switching models to capture non-linear behav- 
ior in a time series. SSSMs have been effectively applied 
in object tracking domains where it is necessary to predict 
the trajectory of various objects. Whiteley et al. [22] intro- 
duce a sequential Monte Carlo algorithm for inference over 
switching state space models using discrete particle filters. 
We present a new avenue of study in which SSSM models 
are used to describe complex time series in a way that can 
be easily interpreted by people. 


4. SWITCHING STATE SPACE MODELS 


SSSMs are commonly used to describe time series² with
non-linear dynamics in econometrics and signal processing
applications [8, 14]. An SSSM includes $M$ latent continuous valued
state space models and a discrete valued switching variable.
Each of the models, which we refer to as regimes, has its
own dynamics. At each point in time, the switching variable
selects one of the individual state-space models to generate
an observation vector.


The SSSM is formalized as:

$$X_t^{(m)} = \Phi^{(m)} X_{t-1}^{(m)} + w_t^{(m)}$$
$$Y_t = S_t A^{(m)} X_t^{(m)} + v_t \qquad (1)$$

Here, $X_t^{(m)}$ denotes the latent continuous valued state for
regime $m$ at time $t$. $S_t$ is a switching variable that selects
the $m^{th}$ regime such that regime $m$ at time $t$ produces
observation vector $Y_t$, which depends on the latent state $X_t^{(m)}$.

Figure 2: Graphical model for the switching-state
space model. A latent discrete switching variable
($S_t$) selects an active, continuous state space model
($X_t^{(m)}$). The observation vector ($Y_t$) depends on the
active regime at time $t$.

² Refer to Shumway and Stoffer [20] for a detailed discussion
of time series analysis models.


The states $X_t^{(m)}$ evolve over time in a way that depends on
the transition matrix $\Phi^{(m)}$ and the previous state $X_{t-1}^{(m)}$.
Figure 2 presents a graphical representation of an SSSM. Edges
between variables represent stochastic causal relationships.
Not shown in the figure are the regime dependent transition
noise $w_t^{(m)}$ and the observation noise $v_t$. $A^{(m)}$ is the output
matrix in the state space formulation, set to the identity matrix
$I$ in our case.


We illustrate how an SSSM can describe the effects of
students’ interactions in CW. $Y_t$ represents the observed water
level in the different areas of the simulation at time $t$. $X_t^{(m)}$
describes the expected levels of water under regime $m$ at
time $t$. $\Phi^{(m)}$ controls the water flow in the simulation
according to the transitions in regime $m$. $S_t$ selects which of
the regimes to use to describe the water level $Y_t$.
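
As a rough illustration of how a regime's transition matrix can encode water flow, the sketch below applies the noise-free part of the state update, $X_t = \Phi^{(m)} X_{t-1}$, to a small state vector. The three areas, the matrix entries and the names `step`, `phi_desert` and `phi_plains` are hypothetical and chosen only for this example; they are not taken from the CW logs.

```python
def step(phi, x):
    """One noise-free transition X_t = Phi X_{t-1} for a small state vector."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]

# Hypothetical 3-area system: [waterfall, desert, plains].
# Row i, column j holds the fraction of area j's water that ends up in area i.
phi_desert = [
    [0.8, 0.0, 0.0],   # the waterfall keeps 80% of its water
    [0.2, 1.0, 0.0],   # the desert receives 20% of the waterfall's water
    [0.0, 0.0, 1.0],   # the plains receive nothing in this regime
]
phi_plains = [
    [0.8, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.2, 0.0, 1.0],   # the plains now receive the waterfall's outflow
]

x = [100.0, 0.0, 0.0]
for _ in range(3):
    x = step(phi_desert, x)   # first period: water routed to the desert
for _ in range(3):
    x = step(phi_plains, x)   # second period: logs moved, water routed to the plains
```

Each matrix plays the role of one regime, and the point at which the loop changes matrix corresponds to a switch in $S_t$.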


Importantly, a single regime is insufficient for modeling the 
effects of students’ interactions with CW. This is because 
students’ actions have a complex impact on the system dy- 
namics. We therefore need to define multiple regimes, where 
each regime describes a series of events that can be (stochas- 
tically) explained by the regime dynamics. A regime is ac- 
tive for a duration of time in CW; we call this duration a 
period. For example, in one period water is mainly flowing 
to the plains and to the desert (as is shown in figure 1). In 
the next period, students move the logs to re-route water 
flow to the wetlands potentially because plant life is dying. 
Each of these periods might be active for different durations 
and their dynamics are described by different regimes. 


4.1 Exploiting Model Structure 


We aim to perform inference over the latent states, $X_t^{(m)}$, the
regime parameters, $\Phi^{(m)}$, and the latent switching variable, $S_t$.
Computing posterior distributions for an SSSM is computationally
intractable [18]. To illustrate, in figure 2 we see that the
graph consists of $M$ state space models that are marginally
independent. These models become conditionally dependent
when $Y_t$ is observed, as is the case in this graph. The
result is that $X_t^{(m)}$ is conditionally dependent on the value
of all of the other latent states and switching variables for
times 1 through $T$ and regimes 1 through $M$ [18]. Previous
approaches use approximate methods such as variational
inference [7] and a ‘merging of Gaussians’ [14, 18] to address
the inference problem. The variational inference approximation
transforms the intractable Bayesian expectation problem
into an optimization problem by minimizing the
Kullback-Leibler (KL) divergence between a simpler family of
approximating distributions and the unknown, intractable
posterior. The merging of Gaussians approach uses a single
Gaussian to represent the mixture of $M$ Gaussians at each
time step, thereby simplifying the computation at the cost
of being susceptible to local optima (see section 5.1).
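
To make the merging step concrete, the sketch below collapses a one-dimensional Gaussian mixture into a single Gaussian by matching its first two moments. Moment matching is a common way to perform this collapse; that the cited baselines use exactly this rule is an assumption here, and the function name `merge_gaussians` is illustrative.

```python
def merge_gaussians(weights, means, variances):
    """Collapse a 1-D Gaussian mixture into one Gaussian by moment matching.

    Returns the mean and variance of the single Gaussian that has the same
    first and second moments as the mixture. Weights are assumed to sum to 1.
    """
    mean = sum(w * m for w, m in zip(weights, means))
    second_moment = sum(w * (v + m * m)
                        for w, m, v in zip(weights, means, variances))
    return mean, second_moment - mean * mean

# Two well separated components collapse into one wide Gaussian.
merged_mean, merged_var = merge_gaussians([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
```

The merged Gaussian is wider than either component, which hints at why the collapsed representation can blur the distinction between regimes.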


While these methods have seen success in previous examples, 
they cannot be applied to our domain. This is because they 
allow the system to switch back and forth between regimes, 
resulting in frequent regime changes that can hinder the in- 
terpretability of the model output. This work takes a differ- 
ent approach by imposing structure on the model to address 
both inference and interpretability challenges. Further, as 
the optimization procedures of the previous work are suscep- 
tible to local optima, we rather use a Markov chain Monte 
Carlo (MCMC) approach to approximate the posterior dis- 
tribution of the latent parameters. 


We make two assumptions, which arise from the need to 
create human interpretable descriptions of complex system 
behavior. Assumption 1: the system advances through a 
series of regimes, each regime is active for a period, and then 
switches to an entirely new regime, one that has not been 
used before. Assumption 2: the regime remains active 
for the maximum possible time for which it can be used to 
describe the period. 


To illustrate, without making these assumptions there are $M$
possible assignments of regimes for each time step, making
a total of $M^T$ combinations of possible assignments, which
is exponential in the number of time steps. Moreover, in
the worst case, the number of possible periods is bounded
by $T$ with a switch at every time step. In contrast, under
our assumptions, there are only two possible assignments of
regimes for each time step (i.e., do we stay in the current
regime or do we progress to the next regime), making for
a total of $2^M$ combinations of possible assignments, where
$M$ is constant. The number of possible periods under this
methodology is bounded by $M$. We hypothesize that the
forced reduction in complexity of the fitted model would
significantly simplify the interpretability of the model for a
human.


4.2 Algorithm for Posterior Inference 
Computing the posteriors in an SSSM corresponds to
approximating the joint distributions over $X_t^{(m)}$ and $\Phi^{(m)}$ given
the observation vector $Y$. A well known problem with MCMC
inference in complex graphical models with hidden variables
is that of identifiability [13]. Models are nonidentifiable
when two sets of parameters can explain the observed
data equally well. For example, in a simple Gaussian mixture
model with means $\mu_0, \mu_1$ and covariances $\Sigma_0, \Sigma_1$, the
marginal posterior distributions of the parameters are identical.
A possible solution to the identifiability problem is to
add constraints (e.g. enforcing $\mu_0 > \mu_1$). However, defining
constraints in higher-dimensional domains is non-trivial.


Another way to address the identifiability problem is to
provide labels for part of the data. This is termed
semi-supervised learning and we incorporate this solution into
our model. In the context of the CW domain, we can label
observations as belonging to one regime or another. Let
$S_t, S_{t+1}, \ldots, S_{t+K-1}, S_{t+K}$ be a consecutive set of $K$ state
variables such that $S_t$ and $S_{t+K}$ have known value assignments
(regime $m$ and regime $m+1$ respectively). The values for
the state variables $S_{t+1}, \ldots, S_{t+K-1}$ are unknown. By
Assumption 1, the switch between regimes $m$ and $m+1$ occurs at
some $S_l$ where $t < l < t+K$. Therefore, the value of $S_l$
determines the values for all of the unknown states, as $S_t$ is
assigned to regime $m$ for $t < l$ and it is assigned to regime
$m+1$ for $t > l$.
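
The consequence of Assumption 1 for a labeled segment can be sketched in a few lines: once the switch index is fixed, every label in the segment follows. The helper name `labels_from_switch` is hypothetical, and treating the variable at the switch index itself as the first member of regime $m+1$ is an assumption made for this example.

```python
def labels_from_switch(t, K, l, m):
    """Regime labels for S_t .. S_{t+K} given a single switch at index l.

    Assumes S_t is labeled regime m, S_{t+K} is labeled regime m + 1, and
    S_l is the first variable assigned to regime m + 1.
    """
    if not t < l <= t + K:
        raise ValueError("switch index must satisfy t < l <= t + K")
    return [m if i < l else m + 1 for i in range(t, t + K + 1)]

segment = labels_from_switch(t=0, K=5, l=3, m=1)
```

A single integer therefore summarizes the whole segment, which is what keeps the marginalization over unknown switch variables manageable.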


We provide a sketch of this process in Algorithm 1. Step 1 
initializes the M supervised switch variables, one per regime. 
The labeled switch variables are spaced uniformly in time 
and are assigned to regimes in increasing order according to 
Assumption 1. This uniform method for initialization can 
be justified by Assumption 2, in that any set of regimes that 
provides an interpretable model is sufficient. The number of 
expected time steps in each period is $K = T/M$, and there
are $K - 2$ unlabeled switch variables between each pair of
switch variables assigned to regimes.


Step 2 performs MCMC sampling to approximate the posterior
of the model³. For the case when the value of the switch
variable is known, the posterior of $X_t^{(m)}$ can be directly
sampled by following the structure of a state space model. In
the case where the switch variable is unknown, we have a
marginalization problem over the two possible values of $S_t$.
For the hidden Markov model (HMM) structure this can be
efficiently computed with the forward algorithm [20]. To
formulate the HMM forward algorithm, we use the observation
probabilities from the individual state space models
in place of the emission probabilities of a standard HMM.
Here, $\pi_{S_t}$ refers to the belief of the state of the switching
variable given the evidence up to that point in time.
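
The forward pass over one segment between two labeled switch variables can be sketched as follows. This is a simplified stand-in rather than the Stan implementation: it uses scalar observations, takes the output matrix as the identity, ignores observation noise so that each regime's likelihood is the one-step predictive density $\mathcal{N}(y_t; \phi_i y_{t-1}, \sigma_i^2)$, and assumes a fixed advance probability `p_advance`. All names are illustrative.

```python
import math

def gaussian_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def forward_beliefs(y, phis, sds, p_advance=0.1):
    """Filtered belief over (current regime, next regime) at each time step.

    The segment starts in the current regime and the chain can only stay
    or advance by one regime, in the spirit of Assumption 1.
    """
    stay = 1.0 - p_advance
    belief = [1.0, 0.0]                # S_t is labeled as the current regime
    beliefs = [belief]
    for prev, obs in zip(y, y[1:]):
        # Predict: the current regime may stay or advance; the next regime
        # is absorbing within this segment.
        predicted = [belief[0] * stay, belief[0] * p_advance + belief[1]]
        # Update with each regime's one-step predictive likelihood.
        weighted = [predicted[i] * gaussian_pdf(obs, phis[i] * prev, sds[i])
                    for i in range(2)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        beliefs.append(belief)
    return beliefs

y = [1.0, 0.99, 0.98, 0.97, 0.5, 0.25, 0.12]
beliefs = forward_beliefs(y, phis=[0.99, 0.5], sds=[0.05, 0.05])
```

In this toy series the belief stays on the first regime while the signal decays slowly and moves to the second regime once the decay rate changes.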


Step 3 uses the regime specific parameters $\Phi^{(m)}$ to make
a maximum likelihood assignment of an observation to a
regime using the Viterbi algorithm [20], thereby specifying
the value of $S_t \; \forall t \in [1 : T]$.
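
A minimal sketch of such a Viterbi pass is given below, restricted to paths in which the regime either stays the same or advances by one. It makes the same simplifying assumptions as a scalar AR(1) model with identity output matrix: point estimates `phis` and `sds` stand in for the regime posteriors, and `p_advance` is an assumed transition probability. None of these names come from the original implementation.

```python
import math

def log_gauss(x, mean, sd):
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd * math.sqrt(2.0 * math.pi))

def viterbi_left_to_right(y, phis, sds, p_advance=0.1):
    """Most likely regime path when regimes can only stay or advance by one."""
    M = len(phis)
    log_stay, log_adv = math.log(1.0 - p_advance), math.log(p_advance)
    neg_inf = float("-inf")
    score = [0.0] + [neg_inf] * (M - 1)   # the path starts in regime 0
    back = []
    for prev, obs in zip(y, y[1:]):
        new_score, pointers = [], []
        for m in range(M):
            stay = score[m] + log_stay
            adv = score[m - 1] + log_adv if m > 0 else neg_inf
            best, arg = (stay, m) if stay >= adv else (adv, m - 1)
            new_score.append(best + log_gauss(obs, phis[m] * prev, sds[m]))
            pointers.append(arg)
        score = new_score
        back.append(pointers)
    # Trace back from the best final regime.
    state = max(range(M), key=lambda m: score[m])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

y = [1.0, 0.99, 0.98, 0.97, 0.5, 0.25, 0.12]
path = viterbi_left_to_right(y, phis=[0.99, 0.5], sds=[0.05, 0.05])
```

Because backward transitions are never scored, the returned path is non-decreasing by construction, which mirrors the constraint that a regime is not revisited.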


Algorithm 1 is computed on an SSSM that implements
Assumptions 1 and 2. Such a model is shown in figure 3. The
model depicts a subset of the time series with $K$ time steps
from time $t$ to time $t+K$. There are two supervised labels
at the boundaries of the subset with the variable $S_t$ assigned
to regime $m$ and variable $S_{t+K}$ assigned to regime $m+1$.
The unknown $K - 2$ states in between are marginalized over
such that the regime specific posteriors can still be
approximated. This model is repeated for the $M - 1$ switches in the
data. The setup is flexible in that informative priors for the
model noise and transition matrices can be specified (and
related) as required by domain knowledge.

Algorithm 1: Posterior inference algorithm

Input: $M$ (number of regimes), $Y$ (vector of observations
for $T$ time steps).

1 Initialization: Label one datapoint per regime, leaving
  $T - (M + 1)$ unlabeled datapoints.

2 MCMC Inference: Draw samples for $X_t^{(m)}, \Phi^{(m)}$ from the
  posterior distribution defined by the structured
  probability model:
  for $Y_t$ in $Y$ do
      if $S_t = m$ is known then
          sample from $P(X_t^{(m)}, \Phi^{(m)} \mid X_{t-1}, S_t = m, Y_t)$
      else
          marginalize over $S_t$. Sample from
          $\sum_{i=m-1}^{m} \pi_{S_t} P(X_t^{(i)}, \Phi^{(i)} \mid X_{t-1}, S_t = i, Y_t)$

3 Posterior Inference: Use the posterior for regime
  parameters ($\Phi^{(m)}$) to run a Viterbi pass on the data $Y$ to
  make a maximum likelihood assignment of the value of $S_t$
  to regime $m$ (thereby learning the switching variables $S_t$).

Output: $S_t$ (assignments to regimes), $\Phi^{(m)}$ (regime
posterior distributions).

Figure 3: Updated graphical model showing the
semi-supervised switching labels, along with the
choice of only two chains between two
semi-supervised points. This representation is repeated
$M - 1$ times to describe the $M - 1$ switches between
the $M$ regimes.

Figure 4: Histogram of the percent of correctly
inferred labels for the observed output. The
structured sampling Algorithm 1 (a) learns the regime
labels more accurately than the randomly initialized
Gaussian merging algorithm (b).

³ Implemented using Stan MC (http://mc-stan.org/)


5. EMPIRICAL VALIDATION 


We evaluate two aspects of Algorithm 1. First, we show 
that it finds the true regime labels in a synthetic dataset. 
Thereafter, we use data that were collected from Connected 
Worlds to run a preliminary experiment that tests whether 
the inferred periods are interpretable to human validators. 


5.1 Evaluation on Synthetic Data 

We generate synthetic data to test whether Algorithm 1 
finds a reasonable representation of known switches in an 
SSSM. Equation 2 describes an SSSM with two regimes 
and a continuous state space. The transition parameters 
and regime noise are determined according to the active 
regime. This model is adapted from Ghahramani and Hin- 
ton [7] which describes a state space that is disjoint at regime 
switches; we rather chose to make the state space continu- 
ous at the switch points as this more accurately mimics the 
scenario that is present in CW. The prior probability of each
of the regimes is 0.5 ($p_1 = p_2 = 0.5$); the regime transition
probabilities are $S_{1,1} = S_{2,2} = 0.95$ and $S_{1,2} = S_{2,1} = 0.05$⁴.
We used this model to generate 1000 time series, each with
200 observations.


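$$X_t^{(1)} = 0.99 X_{t-1} + w_t^{(1)}, \qquad w_t^{(1)} \sim \mathcal{N}(0, 1)$$
$$X_t^{(2)} = 0.9 X_{t-1} + w_t^{(2)}, \qquad w_t^{(2)} \sim \mathcal{N}(0, 10) \qquad (2)$$
$$Y_t = S_t X_t + v_t, \qquad v_t \sim \mathcal{N}(0, 0.1)$$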
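
A generator for one such series might look like the sketch below. It follows Equation 2 with a single state that is carried across regime switches, so the state space is continuous at the switch points. Two details are assumptions made for this example: the second argument of $\mathcal{N}(\cdot, \cdot)$ is read as a variance, and the initial state is set to zero. The function name `generate_series` is illustrative.

```python
import random

def generate_series(T=200, seed=0):
    """Draw regime labels and observations from a two-regime switching AR(1)."""
    rng = random.Random(seed)
    phi = (0.99, 0.9)
    # Assumption: the noise parameters in Equation 2 are variances.
    noise_sd = (1.0 ** 0.5, 10.0 ** 0.5)
    obs_sd = 0.1 ** 0.5
    p_stay = 0.95
    regime = rng.randrange(2)          # prior p1 = p2 = 0.5
    x = 0.0                            # assumed initial state
    labels, ys = [], []
    for _ in range(T):
        # The state is shared across regimes; only the dynamics change.
        x = phi[regime] * x + rng.gauss(0.0, noise_sd[regime])
        ys.append(x + rng.gauss(0.0, obs_sd))
        labels.append(regime)
        if rng.random() >= p_stay:     # switch with probability 0.05
            regime = 1 - regime
    return labels, ys

labels, ys = generate_series()
```

Calling the function with 1000 different seeds would give a collection of the same size as the one described above; labels 0 and 1 correspond to regimes 1 and 2.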


We compare the Gaussian merging baseline that is com- 
monly used in the literature [14] to Algorithm 1 with the 
number of regimes initialized to 9. The accuracy of each ap- 
proach is measured as the percentage of the correctly labeled 
data points as belonging to either regime 1 or regime 2. On 
average Algorithm 1 labels 89% of the data correctly, mate- 
rially higher than the 66% average accuracy of the Gaussian 
merging approach. Figure 4 shows a histogram of the cor- 
rectly inferred switch points in the data according to Algo- 
rithm 1 (top) and the baseline (bottom). The bi-modal and 
long tailed distribution for the baseline approach demon- 
strates its susceptibility to local optima. 


⁴ $S_{j,k}$ denotes the probability of a switch from regime $j$ to
regime $k$.




Figure 5: An example of a generated time series 
from the SSSM model of Equation 2. The x axis 
represents time, and the y axis shows the observa- 
tions (the magnitudes of the signal are irrelevant 
for this investigation). Regime labels are shown as 
black and gray dots representing the two label op- 
tions. True labels (top) are compared to the inferred 
labels from Algorithm 1 (middle) and the Gaussian 
merging (bottom). 


Figure 5 shows an example of the generated time series 
(top) and the associated switch points (bottom). The switch 
points are shown according to the true model, the points 
inferred by Algorithm 1 and the points inferred by the base- 
line. Each period is represented by a sequence of black and 
gray colored circles. As shown by the figure, the periods 
inferred by Algorithm 1 and the baseline both overlap to 
some extent with the true periods. However, there is
substantially more noise in the inferred periods of the baseline.
Algorithm 1 learns the regime autoregressive parameters
$\phi_1 = 0.97 \pm 0.027$ and $\phi_2 = 0.88 \pm 0.035$, again showing
an effective recovery of the individual regime parameters.


The superior performance of Algorithm 1 can be directly 
attributed to the switching behavior that is enforced by As- 
sumptions 1 and 2, which was not assumed by the baseline 
model. Although the model structure encourages the
discovery of switches in Algorithm 1, the uniformly spaced
labels should not be seen as a model advantage, as no prior
knowledge of the actual switches is used in performing this
initialization step. Given that the proposed algorithm finds
a reasonable representation for the switches in a generated 
dataset, we turn to the evaluation of the interpretability of 
its output within the CW context. 


5.2 Preliminary Validation of Interpretability on Connected Worlds Data
Because the ultimate users of the output of Algorithm 1 
will be teachers leading their students in a discussion of the 
simulation behavior, we wanted to confirm that the inferred 
switch points were interpretable by a human seeking to un- 
derstand the “story” of the simulation. In order to do this, 
we used a movie of the water flow (see figure 1 for one such 
frame) and asked evaluators to select one of three possi- 
ble switch points between every pair of consecutive periods. 
Evaluators saw a composite of 1) the movie of the two pe- 
riods; 2) a description of the dynamics of each of the two 
periods and 3) a set of three possible switch points between 
the periods. The evaluator’s task was to choose the switch 
point that best matched the change in dynamics between 
the two periods. One of the three switch points was that 
inferred by Algorithm 1; the other two were random times
sampled uniformly from the beginning of the first period to
the end of the second period.⁵


The descriptions were generated from the inferred parameters
that are an output from Algorithm 1. In equation 1,
$\Phi^{(m)}$ refers to the transition matrix for the $m^{th}$ regime. As
is discussed in section 4, the parameters from this matrix 
describe the expected movement of water in the given pe- 
riod. We threshold the values from this matrix to generate 
a short text description for the water movement. One such 
description could be: “Water is directed towards the desert 
and plains. The wetlands and jungle are receiving little or 
no water”. 
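
The thresholding step could be sketched as follows. The input format (a mapping from biome name to a weight read off the regime's transition matrix), the two threshold values and the function name `describe_flow` are all hypothetical; the thresholds used in the study are not reported here.

```python
def describe_flow(inflow, high=0.25, low=0.05):
    """Turn per-biome inflow weights into a short text description.

    `inflow` maps biome name -> weight taken from a regime's transition
    matrix. The thresholds are illustrative values only.
    """
    receiving = [biome for biome, w in inflow.items() if w >= high]
    dry = [biome for biome, w in inflow.items() if w <= low]
    parts = []
    if receiving:
        parts.append("Water is directed towards the "
                     + " and ".join(receiving) + ".")
    if dry:
        verb = "is" if len(dry) == 1 else "are"
        parts.append("The " + " and ".join(dry)
                     + f" {verb} receiving little or no water.")
    return " ".join(parts)

text = describe_flow(
    {"desert": 0.45, "plains": 0.40, "wetlands": 0.02, "jungle": 0.01}
)
```

With these made-up weights the function reproduces the wording of the example description above; biomes whose weight falls between the two thresholds are simply not mentioned.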


Evaluators worked with five sessions, each of which included 
5 to 10 periods of system dynamics. Selecting the correct 
switch point is not a trivial task: it requires distinguish- 
ing between changes in the system that indicate different 
dynamic regimes and those that are noise within the same 
dynamic regime. We see an evaluator’s ability to choose a 
switch point based on the movie and a description of the 
two contiguous periods as evidence that the inferred periods 
are usable by a teacher who wants to guide students in con- 
structing a causal description of their experience with the 
simulation. Moreover, this can be seen as evidence that the 
inferred regime parameters match inferred period bound- 
aries, together presenting a coherent description for the wa- 
ter movement for a short segment of the CW session. 


Figure 6 shows the results of the validation using four eval- 
uators with knowledge of the CW domain. The five sessions 
are shown along the x-axis; the fraction of correctly selected 
switch points is shown by the bin heights. The dashed line 
represents a random baseline in which the selected switch 
probability corresponds to $\frac{1}{3}$. Under the null hypothesis,
the performance of an evaluator would not be significantly 
different than the random baseline. The results indicate 
that the evaluators chose the switch point identified by Al- 
gorithm 1 significantly more often than the random baseline 
(p < 1x 10°‘), suggesting that the inferred switch points 
were indeed interpretable to a large extent as meaningful 
changes in the state of the system. The differences in inter- 
pretability seen in figure 6 (e.g. session 4 was more difficult 
to interpret than session 3) can provide further guidance 
to us in how to support teachers and students in making 
sense of their experiences in CW. For example, the sessions 
with more complicated dynamics might need more periods 
to fully capture the progression over time. Predefining the 
number of periods for a given session is an aspect of this ap- 
proach that needs addressing. A more detailed user study is 
left for future work. 
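
One way a comparison against a chance baseline of this kind could be tested is with an exact one-sided binomial tail probability, sketched below. Which test was used in the study is not stated here, so this is an assumption, and the counts in the usage line are made up purely to exercise the function.

```python
from math import comb

def binomial_p_value(successes, trials, p_null=1 / 3):
    """One-sided exact binomial tail P(X >= successes) under chance level p_null."""
    return sum(comb(trials, k) * p_null ** k * (1 - p_null) ** (trials - k)
               for k in range(successes, trials + 1))

# Hypothetical counts, not the study's data: 20 correct choices out of 30.
p = binomial_p_value(20, 30)
```

A small value indicates that the evaluators picked the inferred switch point more often than a one-in-three guess would explain.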


6. CONCLUSION AND FUTURE WORK 


This paper has presented novel research into the simplifica- 
tion of log files that are generated by complex participatory 
immersive simulations. The log files were represented as a 
time series that was decomposed with the long term goal of 
producing periods that are useful for a teacher when leading 
reflective discussions about students’ sessions. We have built 
upon previous time series analysis tools to formulate a model 
that automatically segments a time series into these salient
periods. The efficacy of the algorithm was demonstrated on
a synthetic dataset where it outperformed previous work at
the task of assigning data to regimes. We used the
algorithm’s output to generate a short text description of the
dynamics in an inferred period. We find that evaluators are
independently able to validate the inferred changes between
the automatically generated periods. This preliminary study
demonstrates that it is possible to simplify a time series log
into periods of activity that are human interpretable.

Figure 6: Expert validation of five different test
files from sessions with CW. The histogram shows
the fraction of correctly identified switches between
automatically identified periods with an expected
baseline accuracy of $\frac{1}{3}$.

⁵ Visualization available at
https://s3.amazonaws.com/essil-validation/index.html.


Our focus now rests on designing assistive tools for teachers 
that can facilitate their understanding of students’ inter- 
actions in multi-participant immersive simulations. More- 
over, our results suggest that the model should be capa- 
ble of adapting the number of inferred regimes to the com- 
plexity of a given session. Fox et al. [6] explore a Bayesian 
non-parametric model which allows the data to dictate the 
number of regimes that are inferred. The application of this 
model to the CW data presents an attractive tool for remain- 
ing agnostic about the number of regimes that are present 
in a session. Another avenue for future research involves 
exploring the trade-off that is made between the predictive 
power of a model and the explanatory coherence that the 
model achieves. Wu et al. [24] have suggested a method 
for regularizing deep learning models to facilitate people’s 
understanding of their predictions. This is an important 
balance to consider and one that we intend to consider in 
educational settings. 